Random forest
Bagging and random forests are ensemble methods that aim to reduce the variance of models that overfit the training data. In contrast, boosting is an approach to increase the capacity of models that suffer from high bias, that is, models that underfit the training data.
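A minimal sketch of this contrast, assuming scikit-learn and an arbitrary synthetic dataset (not taken from the linked resources): bagging averages many deep, high-variance trees, while boosting (AdaBoost here) sequentially combines shallow, high-bias stumps.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data; the parameters are arbitrary and only for illustration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# A single fully grown tree: low bias, high variance (tends to overfit).
deep_tree = DecisionTreeClassifier(random_state=0)

# Bagging: averages many deep trees fit on bootstrap samples -> lower variance.
bagging = BaggingClassifier(n_estimators=100, random_state=0)

# Boosting: sequentially combines shallow stumps (the default base learner) -> lower bias.
boosting = AdaBoostClassifier(n_estimators=100, random_state=0)

for name, model in [("deep tree", deep_tree), ("bagging", bagging), ("boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name:10s} accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```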
Resources
- https://github.com/kjw0612/awesome-random-forest
- https://sebastianraschka.com/faq/docs/bagging-boosting-rf.html
- https://scikit-learn.org/stable/modules/ensemble.html#random-forests
- http://www.listendata.com/2014/11/random-forest-with-r.html
- https://medium.com/rants-on-machine-learning/the-unreasonable-effectiveness-of-random-forests-f33c3ce28883
- In particular, trees that are grown very deep tend to learn highly irregular patterns: they overfit their training set, i.e. they have low bias but very high variance. Random forests are a way of averaging multiple deep decision trees, trained on different parts of the same training set, with the goal of reducing the variance. This comes at the expense of a small increase in the bias and some loss of interpretability, but generally greatly boosts the performance of the final model (see the sketch below).
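The sketch below, a hand-rolled illustration rather than code from any of the resources above, shows only the bootstrap-averaging part of this idea: several fully grown trees are fit on bootstrap samples and their votes are averaged. A real random forest additionally samples a random subset of features at each split, which this sketch omits.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
trees = []
for _ in range(50):
    # Bootstrap sample: draw n rows with replacement from the training set.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier()  # grown fully -> individually high variance
    tree.fit(X_train[idx], y_train[idx])
    trees.append(tree)

# Majority vote over the ensemble; averaging cancels much of the per-tree noise.
votes = np.mean([t.predict(X_test) for t in trees], axis=0)
ensemble_pred = (votes >= 0.5).astype(int)

single = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("single deep tree accuracy:", (single.predict(X_test) == y_test).mean())
print("bagged ensemble accuracy :", (ensemble_pred == y_test).mean())
```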